IGF-Bagging: Information Gain Based Feature Selection for Bagging

Authors

  • Gang Wang
  • Jian Ma
  • ShanLin Yang
  • S. L. YANG
Abstract

Bagging is one of the older, simpler, and better-known ensemble methods. However, its bootstrap sampling strategy appears to produce ensembles with lower diversity and accuracy than other ensemble methods. This paper proposes a new variant of bagging, named IGF-Bagging. First, the method draws bootstrap samples of the training instances. Then, it applies an Information Gain (IG) based feature selection technique to each sample to identify and remove irrelevant or redundant features. Finally, the base learners trained on the resulting sub-datasets are combined via majority voting. Twelve datasets from the UCI Machine Learning Repository are used to demonstrate the effectiveness and feasibility of the proposed method. Experimental results show that IGF-Bagging significantly improves classification accuracy compared with six other methods.
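The three steps described in the abstract (bootstrap sampling, IG-based feature selection, majority voting) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the base learner or the selection criterion, so the small categorical naive Bayes classifier, the "keep the top `k` features" rule, and all function names here are hypothetical choices. Features are assumed to be discrete.

```python
import math
import random
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(column, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature column."""
    n = len(labels)
    groups = defaultdict(list)
    for value, label in zip(column, labels):
        groups[value].append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def top_k_features(X, y, k):
    """Indices of the k features with the highest information gain.
    Assumption: a fixed top-k cut; the paper may use a different rule."""
    gains = [information_gain([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(gains)), key=lambda j: gains[j], reverse=True)[:k]

class NaiveBayes:
    """Tiny categorical naive Bayes with add-one smoothing.
    Assumption: a stand-in base learner, not specified by the abstract."""
    def fit(self, X, y):
        self.class_counts = Counter(y)
        self.value_counts = defaultdict(Counter)  # (feature, class) -> value counts
        self.n_values = [len({row[j] for row in X}) for j in range(len(X[0]))]
        for row, label in zip(X, y):
            for j, value in enumerate(row):
                self.value_counts[(j, label)][value] += 1
        return self

    def predict_one(self, row):
        def log_posterior(c):
            score = math.log(self.class_counts[c])
            for j, value in enumerate(row):
                score += math.log((self.value_counts[(j, c)][value] + 1)
                                  / (self.class_counts[c] + self.n_values[j]))
            return score
        return max(self.class_counts, key=log_posterior)

def igf_bagging_fit(X, y, n_estimators=10, k=2, seed=0):
    """Train an ensemble: bootstrap, select features by IG, fit a base learner."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_estimators):
        # Step 1: bootstrap sample (draw len(X) instances with replacement).
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        # Step 2: keep only the features ranked highest by information gain.
        feats = top_k_features(Xb, yb, k)
        model = NaiveBayes().fit([[row[j] for j in feats] for row in Xb], yb)
        ensemble.append((feats, model))
    return ensemble

def igf_bagging_predict(ensemble, row):
    """Step 3: combine the base learners' predictions by majority vote."""
    votes = Counter(model.predict_one([row[j] for j in feats])
                    for feats, model in ensemble)
    return votes.most_common(1)[0][0]

# Toy data (illustrative only): feature 0 equals the label, features 1-2 are noise.
data_rng = random.Random(1)
X = [[i % 2, data_rng.randrange(3), data_rng.randrange(3)] for i in range(40)]
y = [row[0] for row in X]
ensemble = igf_bagging_fit(X, y, n_estimators=7, k=1)
print(igf_bagging_predict(ensemble, [1, 0, 2]))
```

Because feature selection runs separately on each bootstrap sample, different base learners may end up with different feature subsets, which is one plausible source of the added diversity the abstract alludes to.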

Similar resources

Maximum Entropy Models and Prepositional Phrase Ambiguity

Prepositional phrases are a common source of ambiguity in natural language and many approaches have been devised to resolve this ambiguity automatically. In particular, several different machine learning approaches have now reached accuracy rates of around 84.5% on the benchmark dataset. Maximum entropy (maxent) models, despite their successful application in many other areas of natural languag...

Rough Sets and Confidence Attribute Bagging for Chinese Architectural Document Categorization

To improve the performance of Chinese architectural document categorization, this work addresses two problems: threshold filtering in traditional feature selection methods loses a lot of effective architectural information, and the Bagging algorithm assigns the same weight to each of its weak classifiers. A new algorithm based on Rough set and Confidence Attribute Bagging is propose...

Bagging and Feature Selection for Classification with Incomplete Data

Missing values are an unavoidable issue in many real-world datasets. Dealing with missing values is an essential requirement in classification problems, because inadequate treatment of missing values often leads to large classification errors. Some classifiers can work directly with incomplete data, but they often produce large classification errors and generate complex models. Feature selecti...

Bagging Binary Predictors for Time Series

Bootstrap aggregating, or Bagging, introduced by Breiman (1996a), has been shown to be effective at improving unstable forecasts. Theoretical and empirical work using classification, regression trees, and variable selection in linear and non-linear regression has shown that bagging can generate substantial prediction gains. However, most of the existing literature on bagging has been limited to t...

A First Study on a Fuzzy Rule-Based Multiclassification System Framework Combining FURIA with Bagging and Feature Selection

In this work, we conduct a preliminary study considering a fuzzy rule-based multiclassification system design framework based on Fuzzy Unordered Rule Induction Algorithm (FURIA). This advanced method serves as the fuzzy classification rule learning algorithm to derive the component classifiers considering bagging combined with feature selection. We develop a study on the use of both bagging and...


Publication date: 2011